This analysis explores political contributions from the State of California to the candidates in the 2016 United States presidential election. The dataset contains records from December 1, 2014 to August 31, 2016.
## [1] 923305 18
## 'data.frame': 923305 obs. of 18 variables:
## $ cmte_id : Factor w/ 24 levels "C00458844","C00500587",..: 6 6 6 7 7 7 7 6 7 7 ...
## $ cand_id : Factor w/ 24 levels "P00003392","P20002671",..: 1 1 1 12 12 12 12 1 12 12 ...
## $ cand_nm : Factor w/ 24 levels "Bush, Jeb","Carson, Benjamin S.",..: 4 4 4 19 19 19 19 4 19 19 ...
## $ contbr_nm : Factor w/ 167608 levels " ALERIS, ANNAKIM",..: 5606 22857 50822 85586 86657 86657 86680 66908 86695 86712 ...
## $ contbr_city : Factor w/ 2015 levels "","-4086",".",..: 899 253 575 252 1433 1433 1907 863 1957 1311 ...
## $ contbr_st : Factor w/ 1 level "CA": 1 1 1 1 1 1 1 1 1 1 ...
## $ contbr_zip : Factor w/ 116642 levels "","00000","000090272",..: 93276 58820 43689 54291 14092 14092 37220 47851 50562 94383 ...
## $ contbr_employer : Factor w/ 50335 levels ""," APPLE INC.",..: 29687 29687 29687 3425 47690 47690 31726 29687 30723 39071 ...
## $ contbr_occupation: Factor w/ 22409 levels ""," REAL ESTATE BROKER",..: 16817 16817 16817 18735 14081 14081 15401 16817 13020 5732 ...
## $ contb_receipt_amt: num 50 200 5 40 35 100 25 40 10 15 ...
## $ contb_receipt_dt : Factor w/ 610 levels "01-APR-15","01-APR-16",..: 496 370 23 72 91 111 72 370 91 111 ...
## $ receipt_desc : Factor w/ 74 levels "","* EARMARKED CONTRIBUTION: SEE BELOW REATTRIBUTION/REFUND PENDING",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ memo_cd : Factor w/ 2 levels "","X": 2 2 2 1 1 1 1 2 1 1 ...
## $ memo_text : Factor w/ 323 levels "","*","* $550 REFUNDED 6/16/16",..: 29 29 29 4 4 4 4 29 4 4 ...
## $ form_tp : Factor w/ 3 levels "SA17A","SA18",..: 2 2 2 1 1 1 1 2 1 1 ...
## $ file_num : int 1091718 1091718 1091718 1077404 1077404 1077404 1077404 1091718 1077404 1077404 ...
## $ tran_id : Factor w/ 920319 levels "A000771210424405B8CF",..: 146834 146116 143498 656792 658236 660537 656254 146154 658232 661449 ...
## $ election_tp : Factor w/ 4 levels "","G2016","P2016",..: 3 3 3 3 3 3 3 3 3 3 ...
## cmte_id cand_id cand_nm
## C00577130:407151 P60007168:407151 Sanders, Bernard :407151
## C00575795:356696 P00003392:356696 Clinton, Hillary Rodham :356696
## C00574624: 57820 P60006111: 57820 Cruz, Rafael Edward 'Ted': 57820
## C00580100: 40507 P80001571: 40507 Trump, Donald J. : 40507
## C00573519: 27348 P60005915: 27348 Carson, Benjamin S. : 27348
## C00458844: 14092 P60006723: 14092 Rubio, Marco : 14092
## (Other) : 19691 (Other) : 19691 (Other) : 19691
## contbr_nm contbr_city contbr_st
## MITCHELL, MARCIA: 301 LOS ANGELES : 70402 CA:923305
## SAMATUA, DENISE : 285 SAN FRANCISCO: 61482
## CARROLL, TERI : 279 SAN DIEGO : 32808
## SMITH, CHERYL : 268 OAKLAND : 23149
## WEIL, MONIQUE : 257 SAN JOSE : 21668
## PENDERGAST, JAN : 245 SACRAMENTO : 16690
## (Other) :921670 (Other) :697106
## contbr_zip contbr_employer contbr_occupation
## 92660 : 339 N/A : 96086 RETIRED :165329
## 926372766: 335 RETIRED : 91615 NOT EMPLOYED :110591
## 900363146: 301 NONE : 84084 ATTORNEY : 22786
## 92037 : 297 SELF-EMPLOYED: 65433 TEACHER : 20945
## 916055507: 293 NOT EMPLOYED : 46576 INFORMATION REQUESTED: 16590
## 951331302: 285 (Other) :538948 (Other) :586929
## (Other) :921455 NA's : 563 NA's : 135
## contb_receipt_amt contb_receipt_dt
## Min. :-10500.0 29-FEB-16: 11735
## 1st Qu.: 15.0 31-MAR-16: 11506
## Median : 27.0 31-MAY-16: 10435
## Mean : 122.5 30-APR-16: 9479
## 3rd Qu.: 80.0 08-JUN-16: 8900
## Max. : 10800.0 09-MAR-16: 8887
## (Other) :862363
## receipt_desc memo_cd
## :909842 :822515
## Refund : 6986 X:100790
## REDESIGNATION FROM PRIMARY : 1324
## REDESIGNATION TO GENERAL : 1324
## REATTRIBUTION / REDESIGNATION REQUESTED: 569
## REDESIGNATION TO CRUZ FOR SENATE : 544
## (Other) : 2716
## memo_text form_tp
## :463672 SA17A:821932
## * EARMARKED CONTRIBUTION: SEE BELOW:388137 SA18 : 94387
## * HILLARY VICTORY FUND : 61871 SB28A: 6986
## REDESIGNATION FROM PRIMARY : 1324
## REDESIGNATION TO GENERAL : 1324
## EARMARKED FROM MAKE DC LISTEN : 858
## (Other) : 6119
## file_num tran_id election_tp
## Min. :1003942 A5602AD777C8C4632B5A: 4 : 181
## 1st Qu.:1077572 ADB49CB248C174E298F0: 4 G2016:113688
## Median :1081057 A26C35A6066754130B99: 3 P2016:809429
## Mean :1083557 A340DF85B7F884133A20: 3 P2020: 7
## 3rd Qu.:1096298 A4E50E2DD07E4475996F: 3
## Max. :1100920 A7C22FA389E0348F98F0: 3
## (Other) :923285
## cmte_id cand_id cand_nm contbr_nm
## 1 C00575795 P00003392 Clinton, Hillary Rodham AULL, ANNE
## 2 C00575795 P00003392 Clinton, Hillary Rodham CARROLL, MARYJEAN
## 3 C00575795 P00003392 Clinton, Hillary Rodham GANDARA, DESIREE
## 4 C00577130 P60007168 Sanders, Bernard LEE, ALAN
## 5 C00577130 P60007168 Sanders, Bernard LEONELLI, ODETTE
## 6 C00577130 P60007168 Sanders, Bernard LEONELLI, ODETTE
## contbr_city contbr_st contbr_zip contbr_employer
## 1 LARKSPUR CA 949391913 N/A
## 2 CAMBRIA CA 934284638 N/A
## 3 FONTANA CA 923371507 N/A
## 4 CAMARILLO CA 930111214 AT&T GOVERNMENT SOLUTIONS
## 5 REDONDO BEACH CA 902784310 VERICOR ENTERPRISES INC.
## 6 REDONDO BEACH CA 902784310 VERICOR ENTERPRISES INC.
## contbr_occupation contb_receipt_amt contb_receipt_dt receipt_desc
## 1 RETIRED 50 26-APR-16
## 2 RETIRED 200 20-APR-16
## 3 RETIRED 5 02-APR-16
## 4 SOFTWARE ENGINEER 40 04-MAR-16
## 5 PHARMACIST 35 05-MAR-16
## 6 PHARMACIST 100 06-MAR-16
## memo_cd memo_text form_tp file_num tran_id
## 1 X * HILLARY VICTORY FUND SA18 1091718 C4768722
## 2 X * HILLARY VICTORY FUND SA18 1091718 C4747242
## 3 X * HILLARY VICTORY FUND SA18 1091718 C4666603
## 4 * EARMARKED CONTRIBUTION: SEE BELOW SA17A 1077404 VPF7BKWA097
## 5 * EARMARKED CONTRIBUTION: SEE BELOW SA17A 1077404 VPF7BKX3MB3
## 6 * EARMARKED CONTRIBUTION: SEE BELOW SA17A 1077404 VPF7BKYBXV4
## election_tp
## 1 P2016
## 2 P2016
## 3 P2016
## 4 P2016
## 5 P2016
## 6 P2016
This data set has 923305 observations of 18 variables. There is a lot of useful data here so let’s start digging!
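The output above has the shape produced by R's standard inspection functions. The following is a minimal sketch of those calls, assuming a data frame named `donations` (the name that appears in the grouped output later in this report); the three rows here are invented stand-ins for the real file, which is not loaded.

```r
# Toy stand-in for the real contribution file; the rows are invented.
donations <- data.frame(
  cand_nm           = c("Sanders, Bernard", "Clinton, Hillary Rodham", "Sanders, Bernard"),
  contbr_city       = c("OAKLAND", "LOS ANGELES", "BERKELEY"),
  contb_receipt_amt = c(27, 200, -15),
  stringsAsFactors  = TRUE
)

dim(donations)      # number of rows and columns
str(donations)      # type and first values of every column
summary(donations)  # level counts for factors, quartiles for numeric columns
head(donations)     # first rows of the data frame
```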
Most contributions were small sums of money. The data include negative values (probably refunds), so those were excluded from the histogram. With the donation amount on a log scale, the majority of donations fall between $10 and $100. Let’s look at the descriptive statistics for the amount of money donated, excluding the refunds.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.01 15.00 27.00 130.50 80.00 10800.00
The stats confirm that there were many more small donations compared to large donations. The mean, median, and even the third quartile are much closer to the minimum value than the maximum value.
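As a rough sketch of the steps described here, the code below drops non-positive amounts, summarises what is left, and bins the amounts on a log10 scale. The amounts are invented, and the ggplot2 call is only one plausible way to draw such a histogram (it is skipped if ggplot2 is not installed).

```r
# Invented amounts; negative values stand in for refunds.
amounts <- c(5, 15, 27, 27, 80, 250, 2700, -50, -2700)

# Keep only positive contributions before summarising.
positive <- amounts[amounts > 0]
summary(positive)

# Count donations per power-of-ten bin: $1-10, $10-100, $100-1000, $1000-10000.
log_amounts <- log10(positive)
hist_counts <- table(cut(log_amounts, breaks = 0:4, include.lowest = TRUE))
hist_counts

# Optional plot object with a log-scaled x-axis (built but not printed).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(data.frame(amount = positive), ggplot2::aes(x = amount)) +
    ggplot2::geom_histogram(bins = 30) +
    ggplot2::scale_x_log10()
}
```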
Let’s see who received the most donations.
On a log scale, the candidates appear to have received roughly equal numbers of donations, but a linear count scale shows that Bernard Sanders and Hillary Rodham Clinton received many more donations than everyone else.
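To illustrate why the two scales give such different impressions, this small sketch uses the six per-candidate counts printed in the summary output and compares each count to the largest one, first on a linear scale and then after taking log10. The ratio of logs is only a rough proxy for relative bar height on a log axis.

```r
# Donation counts taken from the summary output earlier in this report.
counts <- c(Sanders = 407151, Clinton = 356696, Cruz = 57820,
            Trump = 40507, Carson = 27348, Rubio = 14092)

# Relative size on a linear scale: small campaigns almost vanish.
linear_share <- counts / max(counts)
round(linear_share, 2)

# Relative size after a log10 transform: the differences are compressed.
log_share <- log10(counts) / log10(max(counts))
round(log_share, 2)
```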
Let’s find out which political party received the most donations from the State of California.
Let’s see the top 10 cities with the most contributors, the top 10 employers, and top 10 occupations of the contributors.
It seems that retired, self-employed, or unemployed people contributed the most. Or maybe they did not want to publicly announce the company that they work for. There are no specific companies in the top 10 contributors list. If I expand the list to 15, then Kaiser, Google, and Stanford University appear.
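A top-N list like the ones discussed here can be built by sorting a frequency table. The helper below is a hypothetical sketch (the name `top_n_levels` is not from the original analysis), shown on invented city values.

```r
# Hypothetical helper: the n most frequent values of a vector.
top_n_levels <- function(x, n = 10) {
  head(sort(table(x), decreasing = TRUE), n)
}

# Invented city values standing in for donations$contbr_city.
cities <- c("LOS ANGELES", "LOS ANGELES", "LOS ANGELES",
            "SAN DIEGO", "SAN DIEGO", "OAKLAND")

top2 <- top_n_levels(cities, n = 2)
top2
```

Changing `n` from 10 to 15 is all it takes to expand the list, as was done for the employer list above.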
Retired and self-employed Californians contributed the most when looking at occupation or employer statistics. I wonder which candidate retired and self-employed people contributed to the most.
Interestingly, although Bernie Sanders received the largest number of contributions, retired and self-employed people contributed to Hillary Clinton the most. Donald Trump also received more donations than Bernie Sanders in this demographic. This fits the narrative that younger people supported Bernie Sanders more than older people (who are more likely to be retired) did.
This dataset has 923305 donation records with 18 variables. The variables describe the candidates and the contributors, with more detail about the contributors, such as occupation, employer, and city of residence.
Most of the variables are factors with varying numbers of levels, so nearly all of them are categorical. The only continuous variable is the amount of money donated to the candidates (file_num is stored as an integer, but it is an identifier rather than a measurement).
Observations:
* The median contribution is $27 when refunds (negative contributions) are included.
* Excluding refunds, the mean, median, and third quartile of the contribution amount are all much smaller than the maximum.
* There are many more presidential candidates from the Republican Party than from any other party.
* Democratic candidates received many more donations than candidates from other parties.
* Retired and self-employed Californians made more contributions to Hillary Clinton than to any other candidate and clearly favor her.
The main features of this dataset are the candidates and the donations they received from people within California. This record of donations may help predict voting sentiment in the State of California. It is important to remember, however, that donations do not necessarily translate into votes. Donors may not be United States citizens, or may be unable to vote in the general election for other reasons.
The contributor’s information such as city of residence, occupation, and employer as well as the political affiliation of the candidates are good supporting features to understanding the political map of the State of California.
I created a column for the political party that each candidate belongs to. This column was filled with information obtained from outside of this dataset.
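One simple way to add such a column is a named lookup vector indexed by candidate name. This is a sketch only: the lookup below covers five candidates, the data rows are invented, and the original analysis may have built its column differently.

```r
# Hypothetical lookup built from outside knowledge; only five candidates shown.
party_lookup <- c(
  "Clinton, Hillary Rodham" = "Democratic",
  "Sanders, Bernard"        = "Democratic",
  "Trump, Donald J."        = "Republican",
  "Johnson, Gary"           = "Libertarian",
  "Stein, Jill"             = "Green"
)

# Toy rows standing in for the real data frame.
donations <- data.frame(
  cand_nm = factor(c("Sanders, Bernard", "Trump, Donald J.",
                     "Stein, Jill", "Rubio, Marco")),
  contb_receipt_amt = c(27, 40, 100, 75)
)

# Index the lookup by name; candidates missing from the lookup become NA.
donations$party <- unname(party_lookup[as.character(donations$cand_nm)])

# List any candidates that still need a party assigned.
unmapped <- unique(as.character(donations$cand_nm[is.na(donations$party)]))
unmapped
```

Checking `unmapped` before grouping by party helps catch candidates that were left out of the lookup.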
I log transformed the x-axis of the contribution amount histogram. On the linear scale the histogram was positively (right) skewed, with a long tail of large donations, but the log-transformed histogram looked approximately normal with a peak around $30. So it seems that the candidates received a large number of donations of small sums of money.
I also log transformed the y-axis of the number of donations for each candidate. Although there is no statistical meaning for this, the high number of donations to Bernie Sanders and Hillary Clinton made it hard to understand the number of donations that the other candidates received on a continuous y scale.
The boxplot of donation values for each candidate had large outliers, so a second plot uses the 95th percentile of the data as the upper limit of the y-axis. The original boxplot with the full dataset also shows negative values, which are probably refunds requested by the donors. These refunds are not shown in the plot capped at the 95th percentile.
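Capping the y-axis at the 95th percentile can be sketched as follows. The amounts are invented, and the use of `coord_cartesian()` is an assumption about how the plot could be limited: it zooms the view without dropping the outliers from the boxplot statistics, whereas `ylim()` would remove them before the quartiles are computed.

```r
# Invented donations for two candidates, including one large outlier.
toy <- data.frame(
  cand_nm = rep(c("Candidate A", "Candidate B"), each = 5),
  contb_receipt_amt = c(10, 20, 25, 27, 30, 40, 50, 75, 100, 5400)
)

# 95th percentile of all amounts, used as the upper axis limit.
upper <- quantile(toy$contb_receipt_amt, probs = 0.95, names = FALSE)
upper

# Optional plot object (built but not printed); skipped without ggplot2.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(toy, ggplot2::aes(x = cand_nm, y = contb_receipt_amt)) +
    ggplot2::geom_boxplot() +
    ggplot2::coord_cartesian(ylim = c(0, upper))
}
```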
## donations$cand_nm: Bush, Jeb
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -5400 50 500 1068 2700 10000
## --------------------------------------------------------
## donations$cand_nm: Carson, Benjamin S.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -10000.0 25.0 50.0 107.5 100.0 10000.0
## --------------------------------------------------------
## donations$cand_nm: Christie, Christopher J.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700 100 1000 1370 2700 5400
## --------------------------------------------------------
## donations$cand_nm: Clinton, Hillary Rodham
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -5400.0 11.0 25.0 178.6 100.0 10000.0
## --------------------------------------------------------
## donations$cand_nm: Cruz, Rafael Edward 'Ted'
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -8300.00 25.00 50.00 99.19 100.00 10800.00
## --------------------------------------------------------
## donations$cand_nm: Fiorina, Carly
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -3300.0 25.0 100.0 312.7 250.0 5400.0
## --------------------------------------------------------
## donations$cand_nm: Gilmore, James S III
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2700 2700 2700 2700 2700 2700
## --------------------------------------------------------
## donations$cand_nm: Graham, Lindsey O.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700 100 1000 1195 2700 8100
## --------------------------------------------------------
## donations$cand_nm: Huckabee, Mike
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700.0 25.0 50.0 434.8 500.0 5400.0
## --------------------------------------------------------
## donations$cand_nm: Jindal, Bobby
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 11.1 250.0 250.0 749.4 1000.0 2700.0
## --------------------------------------------------------
## donations$cand_nm: Johnson, Gary
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -607.90 74.61 200.00 396.60 500.00 2742.00
## --------------------------------------------------------
## donations$cand_nm: Kasich, John R.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -5400.0 50.0 100.0 505.7 500.0 2700.0
## --------------------------------------------------------
## donations$cand_nm: Lessig, Lawrence
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700.0 50.0 250.0 500.4 500.0 2700.0
## --------------------------------------------------------
## donations$cand_nm: O'Malley, Martin Joseph
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700.0 50.0 250.0 750.2 1000.0 5400.0
## --------------------------------------------------------
## donations$cand_nm: Pataki, George E.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 100 500 1000 1522 2700 2700
## --------------------------------------------------------
## donations$cand_nm: Paul, Rand
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -5400.0 25.0 50.0 187.2 100.0 5400.0
## --------------------------------------------------------
## donations$cand_nm: Perry, James R. (Rick)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700 1000 2700 1797 2700 2700
## --------------------------------------------------------
## donations$cand_nm: Rubio, Marco
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -5400.0 25.0 75.0 343.6 250.0 5400.0
## --------------------------------------------------------
## donations$cand_nm: Sanders, Bernard
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -10500.00 15.00 27.00 48.21 50.00 10000.00
## --------------------------------------------------------
## donations$cand_nm: Santorum, Richard J.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 5.0 25.0 100.0 427.4 300.0 2700.0
## --------------------------------------------------------
## donations$cand_nm: Stein, Jill
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -300.0 50.0 100.0 185.2 250.0 2700.0
## --------------------------------------------------------
## donations$cand_nm: Trump, Donald J.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -3716.0 28.0 40.0 152.4 116.6 5400.0
## --------------------------------------------------------
## donations$cand_nm: Walker, Scott
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -5400.0 100.0 250.0 676.1 1000.0 10800.0
## --------------------------------------------------------
## donations$cand_nm: Webb, James Henry Jr.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 5.0 100.0 275.0 722.3 875.0 5400.0
It seems that Republican candidates received higher dollar value donations than Democratic candidates based on their mean, median, and third quartile statistics. Let’s confirm this.
## donations$party: Democratic
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -10500.0 14.0 27.0 109.4 50.0 10000.0
## --------------------------------------------------------
## donations$party: Libertarian
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -607.90 74.61 200.00 396.60 500.00 2742.00
## --------------------------------------------------------
## donations$party: Green
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -300.0 50.0 100.0 185.2 250.0 2700.0
## --------------------------------------------------------
## donations$party: Republican
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -10000.0 25.0 50.0 184.2 100.0 10800.0
## --------------------------------------------------------
## donations$party: TRUE
## NULL
Although Democratic candidates received a larger number of donations, Republican candidates received higher-value donations, with a higher median, mean, and first and third quartiles.
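The grouped summaries above have the layout that base R's `by()` prints. The sketch below shows that call on invented amounts, along with `tapply()` for pulling out a single statistic per group; the column name `party` follows the output headers, while the numbers are made up.

```r
# Invented amounts for two parties.
toy <- data.frame(
  party = c("Democratic", "Democratic", "Democratic",
            "Republican", "Republican", "Republican"),
  contb_receipt_amt = c(10, 27, 50, 25, 50, 100)
)

# Six-number summary for each group.
by(toy$contb_receipt_amt, toy$party, summary)

# A single statistic per group is easier to compare directly.
medians <- tapply(toy$contb_receipt_amt, toy$party, median)
medians
```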
I wonder which candidate received the highest sum of donations.
Even though Bernie Sanders received the highest number of donations, Hillary Clinton received a much higher sum. This dataset has donation records from Dec 1, 2014 to August 31, 2016. Hillary Clinton was nominated as the Democratic candidate for president on July 26, 2016 so that could explain some of the gap in donation amount between her and Bernie Sanders. However, it seems that Hillary Clinton’s donors made larger donations to her campaign overall than Bernie Sanders’ donors. This can also be seen in the donation summary a few blocks above.
Similar to the univariate analysis, let’s look at the top 10 cities, occupations, and employers for the sum of money donated.
Many of the same cities and professions show up in the top 10 list for sum of donations as the number of donations. However, the difference between the value of the donations among cities stands out a lot more than the number of donations from the previous section. It also seems like donors are more hesitant to name their employer than their occupation since there isn’t a single specific employer name in the top 10 list. Retired Californians seem more invested in the presidential race from their actions, as previously noticed.
With the Democratic and Republican nominees selected, I’d like to see if there are pockets in California that favor one candidate over the other. Let’s find out the top 5 cities that donated the most money to Hillary Clinton and Donald Trump.
Some areas of California clearly favor one candidate more than other areas do. The donations from all cities are overwhelmingly for Hillary Clinton, but Southern California seems to have more Donald Trump supporters than Northern California.
I continued to discover that both the number and the value of the donations were in favor of two Democratic candidates - Hillary Clinton and Bernie Sanders. Something interesting that I noticed is that although Bernie Sanders received a larger number of donations than Hillary Clinton, Hillary received a much higher sum of donations. Bernie Sanders received $19,630,343 from 407,151 donations while Hillary Clinton received $63,707,471 from 356,696 donations. This shows that Hillary Clinton’s supporters gave more money per donation. The mean and median donation to each candidate also corroborate this.
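The per-donation comparison can be checked directly from the totals and counts quoted above. This is plain arithmetic on those four numbers; dividing each total by its count should reproduce the per-candidate means in the grouped summary.

```r
# Totals and counts as stated in the text above.
totals <- c(Sanders = 19630343, Clinton = 63707471)
counts <- c(Sanders = 407151,   Clinton = 356696)

# Average dollars per donation for each candidate.
avg <- totals / counts
round(avg, 2)

# Gaps between the two candidates.
count_gap <- counts[["Sanders"]] - counts[["Clinton"]]  # extra donations to Sanders
money_gap <- totals[["Clinton"]] - totals[["Sanders"]]  # extra dollars to Clinton
```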
I noticed that although the State of California is in clear favor of Democratic candidates, particularly Hillary Clinton and Bernie Sanders, cities in Southern California prefer the Republican nominee more than cities in Northern California.
Other interesting relationships:
* Retired Californians donated the largest number of times and the largest sum of money.
* Donors are more comfortable listing their occupation than their place of employment.
* The gap between the top 10 contributing cities is larger for sums of donations than for numbers of donations.
This dataset has one continuous variable and 18 categorical variables so it is not possible to statistically determine and rank relationships using measures such as correlation.
Looking at the data objectively, the strongest relationship I found is between the sum of money donated and Democratic candidates. Hillary Clinton received 56% of all the money donated from the State of California, while the other 23 candidates received the rest.
I want to see the number and sum of donations that each candidate received during the primary and general election cycle.
Looks like Lindsey Graham’s supporters are thinking ahead.
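A cross-tabulation gives both the number and the sum of donations per candidate and election type. The sketch below uses invented rows; the column names `cand_nm`, `election_tp`, and `contb_receipt_amt` come from the dataset, but the original plots may have been produced another way.

```r
# Invented rows; election_tp uses the codes seen in the summary output.
toy <- data.frame(
  cand_nm           = c("Clinton", "Clinton", "Clinton", "Graham", "Graham"),
  election_tp       = c("P2016", "P2016", "G2016", "P2016", "P2020"),
  contb_receipt_amt = c(25, 100, 50, 1000, 2700)
)

# Number of donations per candidate and election type.
counts <- table(toy$cand_nm, toy$election_tp)
counts

# Sum of donations per candidate and election type.
sums <- xtabs(contb_receipt_amt ~ cand_nm + election_tp, data = toy)
sums
```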
Let’s look at how the top 10 contributing cities and occupations donated, by sum, to each party’s nominee and to the parties themselves. Each party’s presidential nominee is:
* Democratic - Hillary Clinton
* Republican - Donald J. Trump
* Libertarian - Gary Johnson
* Green Party - Jill Stein
The top 10 contributing cities by number of donations are:
##
## LOS ANGELES SAN FRANCISCO SAN DIEGO OAKLAND SAN JOSE
## 70402 61482 32808 23149 21668
## SACRAMENTO BERKELEY LONG BEACH SANTA MONICA SANTA BARBARA
## 16690 16668 10730 10016 8827
Again, the contributions from cities in California are overwhelmingly Democratic - both for the party and the nominee.
Some cities have a higher proportion of Republican support, such as San Diego, while others have almost no Republican support, such as Oakland and Berkeley.
##
## RETIRED NOT EMPLOYED ATTORNEY
## 165329 110591 22786
## TEACHER INFORMATION REQUESTED ENGINEER
## 20945 16590 12763
## HOMEMAKER SOFTWARE ENGINEER PHYSICIAN
## 11356 11029 10763
## PROFESSOR
## 9519
From the graphs of occupations split up by support, it looks like the more academic professions favor Democrats and Hillary Clinton. Occupations like Professor, Teacher, and Software Engineer have no measurable support for the Republican Party or Donald Trump.
The biggest Republican, and Donald Trump, supporters in this list are in the “Retired” category. People from this list have the most overall contributions as well. Since people retire from all professions, it is difficult to determine the exact make-up of this group. There are definitely sub-groups in this category and they may determine the support for either Donald Trump or Hillary Clinton.
I shifted my attention away from overall support and toward the general presidential election, now that the nominees for each party have been selected. My goal was to determine voting patterns based on donations from certain areas and demographics of the state. I looked at the top 10 cities and occupations by number of donations and plotted the sum of their donations for each of the presidential nominees and each of the political parties. I plotted both candidates and parties to see whether support for a nominee deviated from the usual support for that nominee’s party.
Hillary Clinton and the Democratic Party are the clear favourites in California and there was never any doubt about that. Every city and occupation in the top 10 number of donations heavily favored Hillary Clinton. However, there are areas where her support is strongest, Berkeley and Oakland, and some areas where Donald Trump has more support, San Diego and Los Angeles, when compared to other areas.
As for occupations, it seems that academic occupations like Professor and Teacher strongly favor the Democratic Party and Hillary Clinton, while almost half of retired donors favor the Republican Party. However, fewer than half of the retired donors gave to Donald Trump.
A few surprising interactions I found:
* Hillary Clinton received most of her donations during the primary phase of the election. Since Bernie Sanders was a very popular candidate and received a larger number of donations, it was possible that Hillary Clinton would receive most of her donations only after she won the Democratic primary. However, it seems that she had powerful supporters throughout the election cycle.
I wanted to focus on these plots because they show the difference between the type of support that the candidates received from people in California. It’s clear that the two Democratic candidates, Bernie Sanders and Hillary Clinton, have much more support than their Republican counterparts. The interesting thing is that although Hillary Clinton received slightly fewer donations than Bernie Sanders, she received a much higher sum of money than her main opponent during the Primaries. This indicates that Hillary’s supporters are in a better financial condition and/or more invested in the presidential race than Bernie’s supporters.
This plot shows the difference in the value of single donations that each candidate received. Some candidates received a wide range of donations and have large inter-quartile ranges (IQR). There are also a lot of outliers for each candidate.
From the graph, setting aside James Gilmore, whose quartiles all equal $2,700, Bernie Sanders has the smallest IQR ($35) but also a lot of outliers. The largest IQRs belong to Republican candidates Jeb Bush ($2,650) and Chris Christie ($2,600). Neither of them received a large number of donations, so their statistics are less reliable.
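The IQR figures quoted here follow from the first and third quartiles in the per-candidate summaries printed earlier. A short sketch of that subtraction, using three of those quartile pairs:

```r
# First and third quartiles copied from the per-candidate summaries above.
q1 <- c(Sanders = 15, Bush = 50,   Christie = 100)
q3 <- c(Sanders = 50, Bush = 2700, Christie = 2700)

# Inter-quartile range is the distance between the two quartiles.
iqr <- q3 - q1
iqr
```

With the raw amounts available, `IQR(x)` would compute the same quantity directly from a numeric vector `x`.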
The final plot I want to highlight shows the sum of donations from the top 10 contributing cities to the presidential candidates of each political party. These plots show several things:
* Hillary Clinton is overwhelmingly supported in each of the top 10 cities. She received 86% of the donations from the City of San Diego, and that was her lowest proportion of donations from any city.
* There is almost no support for the Libertarian or Green Party or their nominees. The Libertarian candidate received 0.3% of the donations and the Green Party candidate received 0.01% of the donations.
* There is very little support for the Republican candidate, Donald Trump. The largest amount of money he received from a single city was $334,864.90, from the City of Los Angeles.
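Percentages like the ones in this list can be derived by summing donations per city and nominee and then normalising each city's row. The sketch below uses invented amounts (not the real San Diego or Oakland figures) and shows one way such shares might be computed.

```r
# Invented sums for two cities and two nominees.
toy <- data.frame(
  contbr_city       = c("OAKLAND", "OAKLAND", "SAN DIEGO", "SAN DIEGO"),
  cand_nm           = c("Clinton", "Trump", "Clinton", "Trump"),
  contb_receipt_amt = c(90, 10, 80, 20)
)

# Total dollars per city (rows) and nominee (columns).
sums <- xtabs(contb_receipt_amt ~ contbr_city + cand_nm, data = toy)

# Convert each row to proportions so every city adds up to 1.
shares <- prop.table(sums, margin = 1)
round(shares, 2)
```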
California is known as a “Blue”, or Democratic, State and the donation receipt records of Californians definitely support that title. Donors from California overwhelmingly support the Democratic Party and candidates that run on its platform.
This dataset contains 923,305 donation records to 24 candidates from December 2014 to August 2016. Californians donated a combined $119,000,000 to all of the candidates, although 1 of the 24 candidates received 56% of that money. There was 1 candidate each from the Green and Libertarian parties, 5 candidates from the Democratic Party, and 17 Republican candidates. Two Democratic candidates, Bernie Sanders and Hillary Clinton, were clear favourites because they received more donations than all of the other candidates combined. Between the two major Democratic candidates, Bernie Sanders received 50,455 more donations but Hillary Clinton received $44,077,128 more.
There are areas in California that are more favourable to the Republican Party than other areas. The cities of Los Angeles, San Diego, and Newport Beach have a higher percentage of Republican donors than the rest of the cities in California. Conversely, Berkeley and Oakland have almost no Republican supporters.
Retired Californians appear more invested in the election than other demographics because they donated the largest number of times and the largest sum of money to the presidential candidates. People from all professional backgrounds eventually retire, which may explain why this group had the highest proportion of Republican supporters among the top occupations. Highly academic professions such as Professor or Teacher are overwhelmingly Democratic, as shown by their donations.
It’s no secret that California is a Democratic State, especially with this record of donations. Based on the donation pattern, California will most likely vote for Hillary Clinton this November during the presidential election.
Note: Donation tendencies may not correlate with voting patterns. There are several scenarios where an individual may donate and not vote, or vice versa.
The biggest difficulty I had with the analysis of this dataset was plotting the sum of money donated vs other variables. This was especially true when applying a limit on the data such as the sum of money from the top 10 cities or top 10 occupations.
A success in this project, and I think this is generally true for R, was the ease of plotting data with the ggplot2 package. A lot of information can be plotted with much less code using ggplot2 than with plotting tools in other programming languages.
The analysis can be improved for future work by ordering the receipt date timestamp and plotting the donations to each candidate over time. These could be correlated to political events, such as the RNC or DNC, and to polling results. Polling data from the same communities could also be included in this dataset to compare donations with polling sentiment for each candidate.
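As a starting point for that future work, the receipt dates (stored as text such as `29-FEB-16`) would first need to become real dates. The sketch below is one possible approach under stated assumptions: `parse_receipt_date` is a hypothetical helper, all years are assumed to fall in 2000-2099, and the rows are invented. It matches month abbreviations against R's built-in `month.abb` so the result does not depend on the system locale.

```r
# Hypothetical helper: convert "DD-MON-YY" text into Date values.
parse_receipt_date <- function(x) {
  parts <- strsplit(as.character(x), "-", fixed = TRUE)
  day   <- as.integer(vapply(parts, `[`, character(1), 1))
  month <- match(toupper(vapply(parts, `[`, character(1), 2)), toupper(month.abb))
  year  <- 2000L + as.integer(vapply(parts, `[`, character(1), 3))  # assumes 20YY
  as.Date(sprintf("%04d-%02d-%02d", year, month, day))
}

# Invented rows in the same date format as contb_receipt_dt.
toy <- data.frame(
  contb_receipt_dt  = c("29-FEB-16", "01-APR-15", "31-MAR-16", "09-MAR-16"),
  contb_receipt_amt = c(27, 50, 100, 15)
)

# Parse, order chronologically, and total the donations per calendar month.
toy$date  <- parse_receipt_date(toy$contb_receipt_dt)
toy       <- toy[order(toy$date), ]
toy$month <- format(toy$date, "%Y-%m")
monthly   <- tapply(toy$contb_receipt_amt, toy$month, sum)
monthly
```

Monthly totals per candidate could then be plotted as a time series and compared against event dates such as the party conventions.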